Docket CNTR.2114
BACKGROUND OF THE INVENTION 
FIELD OF THE INVENTION 

[0003] The present invention relates to clocking of
sequential logic blocks on an integrated circuit, and more
particularly to a method and apparatus for fine tuning and
permanently programming clock skews of clock signals after
identifying and analyzing critical timing paths during test
and debug.

DESCRIPTION OF THE RELATED ART 

[0004] Integrated circuit designers have employed
simulation and/or test to identify, isolate, and analyze
timing problems on a chip, which often resulted in a chip
design that at best could not perform at target clock
speeds and that at worst had to be modified prior to mass
fabrication. Typically, register logic within each logic
block of the chip is employed to transmit/receive data
to/from a succeeding/preceding logic stage. Setup time
problems occur when a given logic block exhibits a critical
delay path with regard to operations performed within, to
the extent that, at a given clock speed, valid data is not
provided to the next logic stage until after a clock edge
occurs at the next logic stage that is intended to latch
the data. A hold time problem is exhibited by a logic
chain when the given logic block provides valid data to the
next logic block but the data becomes invalid prior to a
clock edge occurring at the next logic block that is
intended to latch the data. In the first case, the
latching clock edge at the next logic stage occurs too soon
for the transfer of valid data. In the second case, the
latching clock edge at the next logic stage occurs too late
for the transfer of valid data.

[0005] FIG. 1A is a simplified block diagram showing a
circuit 100 with two representative successive logic blocks
101 and 103 to which clock signals are provided. The first
logic block 101 (LOGIC BLOCK 1) receives a first clock
signal ECLK1 and provides data signals DATA to the second
logic block 103 (LOGIC BLOCK 2), which receives a second
clock signal ECLK2. FIG. 1B is a timing diagram
illustrating operation of the circuit 100 for synchronized
versus skewed clock signals. The timing diagram generally
shows traces of the ECLK1, ECLK2 and DATA signals versus
time. Particular time points are shown, including times
T1, T2, T3, T4 and T5 occurring in sequential order in
which time T1 occurs first and time T5 occurs last.

[0006] The first two traces of the timing diagram show
the case when the ECLK1 and ECLK2 clock signals are
synchronized, illustrating a setup time problem. As shown,
for example, the ECLK1 and ECLK2 signals have synchronous
edges including substantially coincident falling edges at
time T1 and substantially coincident rising edges at time
T3. The third trace shows the relative timing of the DATA
signals from the first logic block 101 in which the data
switches and becomes valid at a time T4, which is after
time T3. The synchronized clock case illustrates a setup
time problem in which the first logic block 101 exhibits a
critical delay path such that valid data on the DATA
signals at time T4 is not provided to the logic block 103
until after the rising clock edge at time T3 occurs.
Because the first logic block 101 needs more time to
produce valid data than there is time in a clock cycle,
upon the rising edge of ECLK2 at time T3, invalid data is
clocked into the logic block 103.

[0007] The fourth and fifth traces of the timing diagram
show the case when the ECLK2 signal is skewed relative to
the ECLK1 signal. In particular, the fourth trace is a
trace of the ECLK1 signal substantially similar to the
first trace of ECLK1. The fifth trace shows ECLK2 skewed
relative to ECLK1 where the falling edge of ECLK2 occurs at
time T2 after time T1 and the subsequent rising edge of
ECLK2 occurs at a time T5 after the time T4. The skewed
clocks case illustrates that by delaying ECLK2 relative to
ECLK1, the setup time problem is eliminated. In
particular, the rising edge of ECLK2 is delayed until after
the DATA signals become valid, thus allowing a valid
transfer of data from the first logic block 101 to the
second logic block 103.
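
The comparison between the synchronized and skewed cases can be illustrated with a minimal sketch. The numeric times are hypothetical placeholders, since the description of FIG. 1B gives only the order of T1 through T5, and the function name `latches_valid_data` is an illustrative assumption. Setup margin at the receiving register is ignored for simplicity.

```python
# Hypothetical time points in arbitrary units. Only their order
# (T1 < T2 < T3 < T4 < T5) comes from the description of FIG. 1B.
T1, T2, T3, T4, T5 = 0.0, 1.0, 5.0, 6.0, 7.0


def latches_valid_data(data_valid_time: float, latch_edge_time: float) -> bool:
    """Return True if DATA is already valid when the latching edge of ECLK2 occurs."""
    return data_valid_time <= latch_edge_time


# Synchronized clocks: ECLK2 rises at T3, but DATA only becomes valid at T4.
synchronized_ok = latches_valid_data(data_valid_time=T4, latch_edge_time=T3)

# Skewed clocks: the rising edge of ECLK2 is delayed to T5, after DATA is valid.
skewed_ok = latches_valid_data(data_valid_time=T4, latch_edge_time=T5)

print("synchronized:", synchronized_ok)
print("skewed:", skewed_ok)
```

Under these assumed values the synchronized case fails the check and the skewed case passes it, which mirrors the two cases of the timing diagram.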

[0008] Designers have heretofore provided hardwired
logic to skew the clocks that are provided to sequential
logic blocks to solve critical path and hold time problems.
Such solutions, once implemented, however, provided a
permanent clock skew fix that could not be changed without
modifying the chip design. Also, with reference to the
illustrated example, one of ordinary skill in the art will
appreciate that ECLK2 can be delayed only in the event that
there is delay margin associated with the second logic
block 103. An alternative solution is to advance the clock
for latching incoming data into the preceding logic block
(e.g., logic block 101) so that more time is provided for
the stage to perform its work. This alternative solution
is not always feasible, however, and may result in new and
unforeseen timing problems.

[0009] As a matter of practice, designers analyze and 
simulate complex logic paths in an integrated circuit prior 
to committing a design to production. But one skilled in 
the art will appreciate that slight differences in clock 
skew cannot be simulated with sufficient accuracy and 
production process variations furthermore cannot be 
precisely modeled. Hence, fabricated integrated
circuits often exhibit a number of unanticipated critical
timing paths that designers are forced to address prior to
shipment. Consequently, any setup time problems that occur
after-the-fact (i.e., after the chip is fabricated and
permanent clock skews have been set) can only be eliminated
by slowing down the clock speed of the device. Worse yet,
after-the-fact hold time problems render a design totally
inoperative. In either case, significant design
modifications (often including changes to masks, electron
beam analysis, etc.) are required to fix these types of
problems.

SUMMARY OF THE INVENTION

[0010] An integrated circuit (IC) according to an 
embodiment of the present invention includes skew- 
programmable clock buffers, fixed skew logic, an external 
interface and a skew controller. Each skew-programmable 
clock buffer receives a distributed clock signal and 
provides a corresponding local clock signal having a 
programmed skew. The fixed skew logic enables permanent 
programming of static skew values and the external 
interface enables programming of dynamic skew values. The 
skew controller selects between the static and dynamic skew 
values and programs the skew-programmable clock buffers 
based on selected skew values. 

[0011] In one embodiment, the skew controller is
operative to detect a skew over-ride command upon reset of 
the IC and to select between the static and dynamic skew 
values based on the skew over-ride command. The 
programmable memory may be integrated on the IC or 
externally coupled via the external interface. The fixed 
skew logic is implemented as any type of permanent 
programmable block, such as laser-blown fuses, an EPROM,
etc.

[0012] A method of tuning clock skews for an IC
according to an embodiment of the present invention 
includes determining, by the IC upon reset, whether a skew 
over-ride command is provided, selecting skew values from 
fixed skew logic integrated on the IC if the skew over-ride 
command is not provided, selecting skew values from a skew 
memory if the skew over-ride command is provided, 
programming at least one programmable delay block 
integrated on the IC based on selected skew values, and 
receiving, by each delay block, a distributed clock signal 
and providing at least one local clock signal having a skew 
based on a selected skew value. 

[0013] The method may include integrating the skew 
memory as a dynamic memory on the IC and programming the 
skew memory via an external interface. The method may 
further include programming a skew over-ride bit on the IC 
and reading the skew over-ride bit upon reset of the IC. 
The method may further include holding the IC in reset 
while programming the skew memory and the skew over-ride 
bit. The method may further include testing the IC 
programmed with the dynamic skew values, repeating the 
programming and testing to determine an optimum set of skew
values, and programming the fixed skew logic with the
optimum set of skew values. 
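
The program-and-test iteration described above can be sketched as a simple search loop. This is a minimal illustration under stated assumptions: `program_skew_memory` and `run_test` are hypothetical hooks standing in for the external interface and the tester, and a higher test score is assumed to be better. None of these names come from the description.

```python
from typing import Callable, Iterable, Optional, Sequence

SkewSet = Sequence[int]


def find_optimum_skews(
    candidates: Iterable[SkewSet],
    program_skew_memory: Callable[[SkewSet], None],
    run_test: Callable[[], float],
) -> Optional[SkewSet]:
    """Try each candidate skew set and return the one with the best test score."""
    best: Optional[SkewSet] = None
    best_score = float("-inf")
    for skews in candidates:
        program_skew_memory(skews)  # load dynamic skew values while the IC is in reset
        score = run_test()          # release reset and test the IC
        if score > best_score:
            best, best_score = skews, score
    return best


# Stand-in hooks for illustration only: pretend a skew code of 3 is ideal.
loaded = []


def fake_program(skews: SkewSet) -> None:
    loaded.append(tuple(skews))


def fake_test() -> float:
    return -float(sum(abs(s - 3) for s in loaded[-1]))


best = find_optimum_skews([(0, 0), (3, 3), (7, 7)], fake_program, fake_test)
print(best)  # the set that would then be programmed into the fixed skew logic
```

An exhaustive loop over candidates is only one possible search strategy; the description does not prescribe how candidate skew sets are chosen.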

[0014] A system for fine tuning clock signals of an IC 
according to an embodiment of the present invention 
includes a permanent programmable block for permanently 
programming at least one fixed skew value, programmable 
logic for storing at least one dynamic skew value, at least 
one clock buffer, and a skew controller that selects 
between the fixed skew value and the dynamic skew value and 
that programs the clock buffer using the selected skew 
value. Each clock buffer includes programmable delay logic 
that delays a clock signal based on a selected skew value. 

BRIEF DESCRIPTION OF THE DRAWINGS 

[0015] The benefits, features, and advantages of the 
present invention will become better understood with regard 
to the following description, and accompanying drawings 
where:

[0016] FIG. 1A is a simplified block diagram showing a
circuit with two representative successive logic blocks
to which corresponding clock signals are provided;

[0017] FIG. 1B is a timing diagram illustrating
operation of the circuit of FIG. 1A for synchronized versus 
skewed clock signals; 

[0018] FIG. 2 is a block diagram of an integrated 
circuit incorporating a clock skew tuning system according 
to an exemplary embodiment of the present invention; 

[0019] FIG. 3 is a more detailed block diagram of an 
exemplary embodiment of each of the local skew controllers 
of FIG. 2; and 

[0020] FIG. 4 is a schematic diagram of an exemplary 
embodiment of a local clock buffer that may be used within 
the local clock buffer array of FIG. 3 for each of the 
local clock signals. 

DETAILED DESCRIPTION 

[0021] The following description is presented to enable 
one of ordinary skill in the art to make and use the 
present invention as provided within the context of a 
particular application and its requirements. Various 
modifications to the preferred embodiment will, however, be 
apparent to one skilled in the art, and the general 
principles defined herein may be applied to other 
embodiments. Therefore, the present invention is not
intended to be limited to the particular embodiments shown
and described herein, but is to be accorded the widest 
scope consistent with the principles and novel features 
herein disclosed. 

[0022] The inventors of the present application have 
recognized the need for providing designers an apparatus 
and method for dynamically controlling the skew of local 
clocks on an integrated circuit (IC) during test or debug 
and for programming optimum clock skews into a fabricated 
device. They have therefore developed a method and 
apparatus for fine tuning clock signals on a fabricated 
device and for permanently establishing optimized clock 
skews to maximize speed and to compensate for unanticipated 
problems after-the-fact, as will be further described below 
with respect to FIGURES 2-4. 

[0023] The present invention provides an apparatus and 
method for dynamically controlling the skew of local clocks 
on an integrated circuit (IC) during test/debug and for 
programming optimum clock skews into a fabricated device.
In the absence of programmed skews, upon power up the
device utilizes clock skews that have been programmed into
the chip itself via a fixed skew logic block, such as fuses
or an EPROM (Electrically Programmable Read-Only Memory) or
the like. Clock skews for test are stored in a skew
memory, and a skew controller integrated on the chip is 
directed to utilize the loaded skews upon reset. The skew 
memory may be a dynamic memory integrated on the chip, 
which is loaded via an external interface. The skew for 
each local clock can be delayed up to a maximum amount in 
predetermined increments. Once an optimal set of clock
skews is determined, skew values are permanently stored on
chip in the fixed skew logic. By providing dynamically 
programmable local clock skew and means for permanently 
establishing optimized clock skews, the speed of a given 
part can be maximized and compensation can be provided for 
unanticipated problems after-the-fact, or after the part is 
fabricated. In this manner, the part can be optimized and 
otherwise inoperative parts can be rendered operative and 
optimized. 

[0024] FIG. 2 is a block diagram of an integrated 
circuit (IC) 200 incorporating a clock skew tuning system 
according to an exemplary embodiment of the present 
invention. In the exemplary embodiment illustrated, the IC 
200 is a microprocessor, although it is understood that the 
present invention applies to any type of circuitry or 
function fabricated onto a chip. As known to those of
ordinary skill in the art, the IC 200 includes a clock
generator 201 which provides a primary clock signal 
referred to as CORE CLK. The CORE CLK signal is provided 
to a clock distribution network 203, which provides 
multiple distributed copies or versions of the CORE CLK 
signal, individually shown as signals EE CLK1, EE CLK2, ...,
EE CLKN, where "N" is a positive integer. Each of the
distributed EE CLKx signals (where "x" is an integer from 1
to N) is provided to a corresponding one of multiple logic
blocks 205 incorporated onto the IC 200. The logic blocks
205 are individually shown as LOGIC BLOCK 1, LOGIC BLOCK 2,
..., LOGIC BLOCK N, where any appropriate number "N" of
the logic blocks 205 is contemplated depending upon the 
particular function integrated onto the IC 200. The logic 
blocks 205 generally represent the major logic blocks of 
the device implemented on the IC 200. 

[0025] Each of the logic blocks 205 includes or is 
otherwise associated with a corresponding one of multiple 
local skew controllers 207, individually shown as LOCAL
CTRL 1, LOCAL CTRL 2, ..., LOCAL CTRL N. Each of the
local skew controllers 207 receives a corresponding one of
the EE CLKx signals and provides a corresponding set of one
or more local or "E level" clock signals referred to as E
CLKSx, where each clock set is individually shown as E
CLKS1, E CLKS2, ..., E CLKSN. Each E CLKSx represents a
set of one or more local clock signals as further described
below. The EE CLKx signals are generally synchronized 
(i.e., coincident edges), although respective timing may 
vary depending upon process variations and other physical 
factors, such as location of given EE CLKx signals on the
chip, capacitive coupling, signal trace characteristics,
etc. The local skew controllers 207 insert a programmed
skew into each local clock signal of each E CLKSx set so
that their relative timing depends upon the programmed
skews.

[0026] The IC 200 includes a skew controller 209, which
outputs one or more delay values (DV) on a DV signal line
provided to each of the local skew controllers 207. In one
embodiment, the delay values are provided in the form of a
serial stream of binary encoded delay bits. The IC 200
also includes a fixed skew logic block 211, which outputs
fixed skew values (FSV) to the skew controller 209 via one
or more FSV signal lines. The IC 200 further includes test
logic 213 coupled to the skew controller 209. The test
logic 213 is externally accessible via an external test
port 215 provided on the IC 200. The test port 215 may
include any number of external pins of the IC as known to
those skilled in the art. The external pins may be 
dedicated test pins or dual purpose pins as known to those 
of ordinary skill in the art. A chip tester 217 is coupled 
to the IC 200 via the test port 215, which provides a 
communication interface between the chip tester 217 and the 
test logic 213. 

[0027] The test logic 213 and the test port 215 may be
implemented according to the JTAG (Joint Test Action Group)
standard, in which the test logic 213 comprises JTAG logic. In a
standard test configuration, the chip tester 217 includes a 
socket (not shown) or the like for receiving the IC 200. 
The chip tester 217 is able to control the external pins of 
the IC 200, including a reset (RST) pin via a signal R. In 
a typical JTAG configuration, for example, the chip tester 
217 powers up the IC 200 and holds it in reset by asserting 
the R signal while configuring the IC 200 for test. During 
the test configuration while the IC 200 is held in reset, 
the chip tester 217 is able to access the test logic 213 
via the test port 215, and set values or bits in selected 
registers or even load test data and/or routines for test 
and debug purposes. The chip tester 217 releases the R
signal to pull the IC 200 out of reset, while the chip
tester 217 monitors its operation. 

[0028] In accordance with an embodiment of the present 
invention, the chip tester 217 holds the IC 200 in reset 
while loading dynamic clock skew values into a skew memory 
221 coupled to the skew controller 209 via dynamic skew 
value (DSV) signal lines within the IC 200. The skew 
memory 221 may be implemented as any type of dynamic or 
volatile memory device, or any type of programmable logic 
that maintains its state while power is supplied to the IC 
200. In the configuration shown, the test logic 213 
enables external access to the skew memory 221 via the skew 
controller 209. It is appreciated, however, that if the 
skew memory 221 is integrated onto the IC 200, it may be 
located anywhere and may be coupled to enable external 
access in any convenient manner, such as provided within or 
otherwise directly coupled to the test logic 213. A RESET 
signal is shown provided to the skew controller 209, which 
holds it inactive while the IC 200 is held in reset. When 
the chip tester 217 releases the R signal to allow the IC 
200 to come out of the reset condition, the skew controller 
209 is configured to determine if a skew over-ride command 
is provided, and if so, to retrieve the dynamic skew values
stored within the skew memory 221. The skew controller 209
uses the skew values to generate the delay values provided
to program the local skew controllers 207 via the DV signal
line. In this manner, a designer may program any skew into
any skew-programmable local clock buffer on the IC 200 for 
purposes of test and/or debug. 

[0029] Upon power up or reset and in the absence of 
direction from the test logic 213 (e.g., if the skew over- 
ride command is not provided), the skew controller 209 
retrieves the fixed skew values from the fixed skew logic 
211 for purposes of programming the local skew controllers 
207. The fixed skew logic 211 is implemented in any 
suitable manner, such as multiple programmable fuses or an 
EPROM or any other type of permanent programmable block 
having the ability to retain its programmed state when 
power to the chip is removed. An initial configuration for 
fuses is that none of the fuses are blown, which results in 
zero skew delay for each local clock signal of each E CLKSx 
set. Alternatively, a portion of the fuses are blown to 
provide nominal skews. For an EPROM embodiment, the EPROM 
may be initially programmed with data, such as data 
associated with zero or nominal delays.

[0030] The skew controller 209 is configured, upon power
up or reset, to determine whether the skew over-ride
command is provided. If the skew over-ride command is not
provided, the skew controller 209 selects the fixed skew
values from the fixed skew logic 211. If the skew
over-ride command is provided, the skew controller 209
selects the dynamic skew values stored in the skew memory
221. In either case, the selected skew values are used to
generate the delay values asserted on the DV signal line.
Any suitable configuration for the skew values and the
corresponding delay values is contemplated. In one
embodiment, the skew values and delay values are the same,
in which case the bits of the skew values are accessed in
order and directly serially shifted out of the skew
controller 209 onto the DV signal line.
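
The selection and serial shifting performed by the skew controller can be sketched as follows. This is an illustration only: the function names are hypothetical, the 3-bit width is borrowed from the exemplary encoding described for the local clock buffers, and the most-significant-bit-first ordering is an assumption, since the description does not state a bit order.

```python
from typing import List, Sequence


def select_skew_values(
    over_ride: bool, fixed: Sequence[int], dynamic: Sequence[int]
) -> Sequence[int]:
    """Pick the dynamic skew values when the over-ride command is present, else the fixed ones."""
    return dynamic if over_ride else fixed


def to_serial_stream(skew_values: Sequence[int], bits_per_value: int = 3) -> List[int]:
    """Flatten skew values into a list of bits, most significant bit first (assumed order)."""
    stream: List[int] = []
    for value in skew_values:
        if not 0 <= value < (1 << bits_per_value):
            raise ValueError(f"skew value {value} does not fit in {bits_per_value} bits")
        for shift in range(bits_per_value - 1, -1, -1):
            stream.append((value >> shift) & 1)
    return stream


fixed_values = [0, 0]    # e.g., no fuses blown, zero skew
dynamic_values = [3, 1]  # e.g., values loaded for test

selected = select_skew_values(over_ride=True, fixed=fixed_values, dynamic=dynamic_values)
dv_stream = to_serial_stream(selected)
print(dv_stream)
```

Here the skew values and delay values are treated as identical, matching the embodiment in which the bits are shifted out directly.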

[0031] The skew over-ride command may be implemented in 
any appropriate manner. As illustrated, for example, the 
skew over-ride command is programmed into the skew memory 
221 as one or more skew over-ride command (SOC) bit(s) 223. 
In this case, the skew controller 209 is configured to read 
the SOC bit 223 to determine the presence of dynamic skew 
values, and if so, to retrieve the dynamic skew values from 
the skew memory 221 rather than the static skew values from
the fixed skew logic 211. Alternatively, the skew
over-ride command may be asserted by the test logic 213
including a programmed SOC bit. The skew over-ride command
may alternatively be implemented by an SOC bit located in
any programmable register within the IC 200. A
microprocessor chip, for example, includes several
multi-purpose registers that may be used for this purpose.

[0032] In yet another embodiment, the chip tester 217
asserts the skew over-ride command via the external test 
port 215 during the reset process. In this case, the skew 
controller 209 is configured to monitor the test port 215, 
either directly or via the test logic 213. In yet another 
embodiment, the skew memory 221 is provided on the chip 
tester 217, shown with dashed lines, rather than being 
integrated on the IC 200. This embodiment provides the 
benefit of minimizing memory on the IC 200, if desired, 
such as for chips incorporating relatively simple functions 
with limited space. In this off-chip configuration, when
coming out of reset, the skew controller 209 is configured 
to retrieve the dynamic skew values from the test logic 
213, which in turn retrieves them from the chip tester 217. 
The skew over-ride command or SOC bit may be implemented 
on-chip or off-chip as previously described.

[0033] By means of the chip tester 217, clock skews are
programmed by the test logic 213 during test or debug. In
this manner, a designer dynamically changes the local clock
delays during testing of the IC 200 to identify critical
timing paths, to analyze timing problems, and/or to
determine an optimum set of local clock skews that maximizes
the speed of the IC 200 or that otherwise renders operative
an IC 200 that would otherwise have been scrapped.
After determination of an optimum set of local clock skews,
the optimum skews are permanently programmed into the IC
200, such as by programming (or re-programming) of the fixed
skew logic 211. Thereafter, upon initialization (reset or
power-up) of the IC 200, the skew controller 209 utilizes
the optimal set of clock skews programmed into the fixed
skew logic 211 to program the local skew controllers 207.

[0034] It is noted that depending upon the configuration
of the fixed skew logic 211, it may be re-programmed
if desired to program a new set of skew delay values. Such
re-programming may not be available for one-time
programmable devices, such as laser-blown fuses. It is
also appreciated that regardless of where the dynamic skew
memory used for test and/or debug is located, when the chip
tester 217 is disconnected, each time the IC 200 is powered
up or reset, the skew controller 209 retrieves the fixed 
delays programmed into the fixed skew logic 211 and 
programs the local skew controllers 207 accordingly. 

[0035] FIG. 3 is a more detailed block diagram of an 
exemplary embodiment of each of the local skew controllers 
207. The DV signal line is provided to an input of delay 
intercept logic 301, where the DV signal line is also 
provided to any subsequent local skew controllers 207. The 
delay intercept logic 301 provides M sets of encoded delay 
bits to a local clock buffer array 303, which outputs M
different local clock signals E CLK0, E CLK1, ..., E
CLKM. The number "M" is any suitable positive integer
representing the number of local clock signals generated 
for the particular one of the local skew controllers 207. 

[0036] In the embodiment shown, each set of encoded 
delay bits includes 3 true bits CAP0, CAP1 and CAP2 and a
corresponding 3 complementary bits CAPB0, CAPB1 and CAPB2,
where a "B" appended in the signal name denotes the logic
complementary bit. As shown, the CAPi bits are grouped
together (where i is an index integer from 0 to 2) and the
CAPBi bits are also separately grouped together so that the
delay intercept logic provides bit sets CAP2_[M:0],
CAP1_[M:0], CAP0_[M:0] and complementary bit sets
CAPB2_[M:0], CAPB1_[M:0], CAPB0_[M:0] to the local clock
buffer array 303. A corresponding one of the EE CLKx
signals is also provided to the local clock buffer array
303.

[0037] The DV signal may include any number of bits for 
conveying the delay values to the delay intercept logic 301 
of each of the local skew controllers 207. In the 
exemplary embodiment shown, the DV signal is a single bit 
line and the skew controller 209 outputs a serial stream of 
binary encoded bit values including a delay value for each 
and every one of the local clock signals E CLKy
(where "y" is an integer from 1 to M). In one embodiment,
for example, if N = 10 (i.e., 10 logic blocks 205) and if M
is 5 for each of the local skew controllers 207, then there
is a total of 50 local clock signals. If each delay value 
is 3 bits, then the skew controller 209 provides a serial 
bit stream of at least 150 bits to program each local clock 
signal. Each delay intercept logic 301 is configured to 
extract a corresponding 15 bits from the delay value 
stream, including 3 encoded bits for each of its 5 local 
clock signals. Each delay intercept logic 301 is further 
configured to output the extracted delay bits and their
complements as the CAPi_[M:0] and CAPBi_[M:0] signals to
its local clock buffer array 303. For the 3-bit example,
given a delay value of DV = 001b (where an appended "b"
denotes a binary number), the delay bits include a true set
of delay bits 001b and a complementary set of delay bits 110b.
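
The bit counts in this example, and the extraction of one controller's share of the stream, can be checked with a short sketch. The slicing scheme is an assumption: the description does not say where each controller's bits sit in the stream, so controller k is simply assumed to own bits 15k through 15k + 14. The name `extract_delay_bits` is hypothetical.

```python
from typing import List, Tuple

N_BLOCKS = 10       # logic blocks 205 in the example
M_CLOCKS = 5        # local clock signals per local skew controller
BITS_PER_DELAY = 3  # bits per delay value

TOTAL_CLOCKS = N_BLOCKS * M_CLOCKS               # 50 local clock signals
TOTAL_BITS = TOTAL_CLOCKS * BITS_PER_DELAY       # 150 bits in the serial stream
BITS_PER_CONTROLLER = M_CLOCKS * BITS_PER_DELAY  # 15 bits per delay intercept logic


def extract_delay_bits(
    stream: List[int], controller_index: int
) -> List[Tuple[List[int], List[int]]]:
    """Return (true bits, complementary bits) for each local clock of one controller.

    Assumes controller k owns bits [15*k, 15*k + 15) of the stream.
    """
    start = controller_index * BITS_PER_CONTROLLER
    chunk = stream[start:start + BITS_PER_CONTROLLER]
    if len(chunk) != BITS_PER_CONTROLLER:
        raise ValueError("stream too short for this controller")
    pairs = []
    for i in range(0, BITS_PER_CONTROLLER, BITS_PER_DELAY):
        true_bits = chunk[i:i + BITS_PER_DELAY]
        comp_bits = [1 - b for b in true_bits]  # logic complement of each bit
        pairs.append((true_bits, comp_bits))
    return pairs


# Example stream in which every delay value is 001b.
example_stream = [0, 0, 1] * TOTAL_CLOCKS
pairs = extract_delay_bits(example_stream, controller_index=2)
print(TOTAL_BITS, BITS_PER_CONTROLLER, pairs[0])
```

With every delay value set to 001b, each extracted pair is the true set 001b and the complementary set 110b, matching the 3-bit example in the text.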

[0038] FIG. 4 is a schematic diagram of an exemplary
embodiment of a local clock buffer 400 that may be used
within the local clock buffer array 303 for each of the
local clock signals E CLKy. A representative EE CLKx
signal is provided to the input of an inverter/buffer 401,
which asserts a signal IN0 at its output to the input of
another inverter/buffer 403. The inverter/buffer 403
asserts a signal IN1 at its output, which is coupled to the
input of another inverter/buffer 405. The inverter/buffer 
405 asserts a corresponding local clock signal E CLKy at 
its output. It is noted that the E CLKy signal is inverted 
relative to the corresponding EE CLKx signal given an odd 
number of inverters in the embodiment shown. An additional 
inverter/buffer (not shown) may be used to invert the clock 
signal again, or any of the inverter/buffers 401, 403, or 
405 may be configured as a buffer if desired. 

[0039] The CAP0 signal is provided to the gates of two
N-channel devices N1 and N2 and the CAPB0 signal is
provided to the gates of two P-channel devices P1 and P2.
The CAP1 signal is provided to the gates of a pair of
N-channel devices N3 and N4 (shown collectively as N4:N3) and
to the gates of another pair of N-channel devices N5 and N6
(shown collectively as N6:N5). The CAPB1 signal is
provided to the gates of a pair of P-channel devices P3 and
P4 (shown collectively as P4:P3) and to the gates of
another pair of P-channel devices P5 and P6 (shown
collectively as P6:P5). The CAP2 signal is provided to the
gates of an array of four N-channel devices N7, N8, N9 and
N10 (shown collectively as N10:N7) and to the gates of
another array of four N-channel devices N11, N12, N13 and
N14 (shown collectively as N14:N11). The CAPB2 signal is
provided to the gates of an array of four P-channel devices
P7, P8, P9 and P10 (shown collectively as P10:P7) and to
the gates of another array of four P-channel devices P11,
P12, P13 and P14 (shown collectively as P14:P11).

[0040] The drains of the N-channel devices N1, N3, N4
and N7 - N10 and drains of the P-channel devices P1, P3, P4
and P7 - P10 are all coupled to the output of the
inverter/buffer 401. The drains of the N-channel devices
N2, N5, N6, and N11 - N14 and drains of the P-channel
devices P2, P5, P6, and P11 - P14 are all coupled to the
output of the inverter/buffer 403. The sources of all of
the N-channel devices N1 - N14 and the P-channel devices P1
- P14 are floated (floating sources) so that the outputs of
the inverter/buffers 401, 403 see the channel and source
capacitances of each of the devices coupled thereto during
signal transitions.

[0041] The exemplary local clock buffer 400 is 
implemented using sequentially-coupled buffers with one or 
more intermediate nodes, and binarily-distributed N-channel 
and P-channel arrays coupled to one or more of the 
intermediate nodes to effectuate a digitally controlled 
skew. In particular, the N-channel devices N1, N4:N3 and
N10:N7 form a binarily-distributed N-channel array and the
P-channel devices P1, P4:P3 and P10:P7 form a corresponding
complementary and binarily-distributed P-channel array. In
a similar manner, the N-channel devices N2, N6:N5 and
N14:N11 form another binarily-distributed N-channel array
and the P-channel devices P2, P6:P5 and P14:P11 form
another corresponding complementary and binarily- 
distributed P-channel array. In the embodiment shown, two 
stages of delay are provided from the EE CLKx signal to the 
E CLKy signal to compensate for different turn on and turn 
off characteristics of P- and N-channel devices. For
example, the N-channel device turn off characteristics in
the second stage (IN0 to IN1) compensate for the P-device
turn on characteristics in the first stage (EE CLKx to
IN0). The particular configuration of each of the clock
buffers is exemplary only and any other type of digitally 
programmable delay logic known to those of skill in the art 
is contemplated. 

[0042] The CAP0/CAPB0 signals control one set of N- and
P-channel devices (N1/P1 and N2/P2), the CAP1/CAPB1 signals
control two arrayed sets of like N and P devices
(N4:N3/P4:P3 and N6:N5/P6:P5), and the CAP2/CAPB2 signals
control four arrayed sets of N and P devices (N10:N7/P10:P7
and N14:N11/P14:P11). In this manner, asserting the CAP2
signal results in a delay of the EE CLKx signal that is
four times that exhibited by asserting the CAP0 signal.
Each of the P- and N-channel device pairs is matched to
provide substantially the same resistive/capacitive (RC)
characteristics, resulting in a relatively constant delay
increment for each asserted step of the 3-bit encoded
signal. In one embodiment, each delay increment is
approximately 6 picoseconds (ps). Thus, when the CAP0
signal is asserted high (logic 1) while the CAPB0 signal is
asserted low (logic 0), a 6 ps delay is added between the
input EE CLKx signal and the output E CLKy signal. In a
similar manner, when the CAP1 signal is asserted high while
the CAPB1 signal is asserted low, a 12 ps delay is added,
and when the CAP2 signal is asserted high while the CAPB2
signal is asserted low, a 24 ps delay is added. For
example, a value of CAP2:CAP0 = 011b equates to an overall
clock delay of approximately 18 ps through the local clock
buffer 400. In summary, a delay from 0 to 42 ps may be 
added in 6 ps increments using the 3-bit encoded delay 
CAP2:CAP0 signals. One skilled in the art will appreciate 
that asserting the CAPx and CAPBx signals results in delay 
as has heretofore been described due to the formation of 
channel-to-gate capacitances that are seen by the gates of 
corresponding devices when their associated CAPx and CAPBx 
signals are asserted. For example, an N-channel device 
sees a channel-to-gate capacitance because, when its gate 
is high, an inversion channel forms from its drain to its 
source, thus forming the channel-to-gate capacitance that 
is seen through the drain of the N-channel device. If the 
gate of the N-channel device is low, an inversion channel 
is not formed. For instance, in considering device N1 
(neglecting parasitic capacitances), if CAP0 is high, 
signal IN0 sees the gate capacitance, source capacitance, 
and drain capacitance of N1. But if CAP0 is low, then 
signal IN0 sees only the drain capacitance of N1. Delays 
similarly result from P-channel devices configured as shown 
in FIG. 4 when their corresponding CAPBx signals are 
asserted low. 
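
The binary weighting of the delay code can be checked with 
a minimal sketch. It assumes an ideal, constant 6 ps 
increment (the text calls this value approximate), and it 
assumes that each CAPBx signal is driven as the logical 
complement of its CAPx signal. The function names 
added_delay_ps and control_levels are hypothetical. 

```python
STEP_PS = 6  # approximate per-increment delay from the text, in ps


def added_delay_ps(cap: int, step_ps: int = STEP_PS) -> int:
    """Delay added by a 3-bit CAP2:CAP0 code in an idealized model.

    Bit 0 contributes 1 step, bit 1 contributes 2 steps and bit 2
    contributes 4 steps, so the total is simply cap * step_ps.
    """
    if not 0 <= cap <= 0b111:
        raise ValueError("CAP2:CAP0 is a 3-bit code (0..7)")
    return cap * step_ps


def control_levels(cap: int) -> dict[str, int]:
    """Logic levels on CAPx and CAPBx for a given code.

    Assumption: CAPBx is the complement of CAPx, so a bit adds delay
    when CAPx is high and CAPBx is low.
    """
    levels = {}
    for bit in range(3):
        on = (cap >> bit) & 1
        levels[f"CAP{bit}"] = on
        levels[f"CAPB{bit}"] = 1 - on
    return levels


if __name__ == "__main__":
    for code in range(8):
        print(f"{code:03b}b -> {added_delay_ps(code)} ps")
```

In this model the code 011b gives 6 + 12 = 18 ps, and the 
full-scale code 111b gives 6 + 12 + 24 = 42 ps. A real 
part would be expected to deviate from these ideal values 
with process, voltage and temperature. 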

[0043] One advantage of a clock skew tuning system 
according to embodiments of the present invention is that 
the local clock skews of a fabricated part may be optimally 
determined during test of the part to maximize the clock 
speed of the part. Another advantage of the present 
invention is that a means is provided for debugging 
critical path problems and determining local clock skews 
that correct the critical path problems for parts that have 
heretofore been scrapped. A third advantage is that 
another technique is now provided for quantitatively 
measuring and compensating for the effects of process 
variations via simple JTAG analysis techniques as opposed 
to complex techniques (e.g. electron beam analysis) that 
would otherwise result in modifications to the layout of 
the chip. The determined local clock skews that render the 
part useful and/or optimize speed of the part may then be 
permanently programmed via any suitable permanent 
programming means provided on the chip, such as fuses or an 
EPROM or the like. 

[0044] Although the present invention has been described 
in considerable detail with reference to certain preferred 
versions thereof, other versions and variations are 
possible and contemplated. For example, additional stages 
or additional levels of arrayed N-channel and P-channel 
devices may be provided to add further delay if desired. 
Moreover, although the present disclosure contemplates 
application to metal-oxide semiconductor (MOS) type 
devices, including complementary MOS devices and the like, 
such as, for example, NMOS and PMOS transistors, it may 
also be applied in a similar manner to analogous types of 
technologies and topologies, such as bipolar devices and 
the like. 

[0045] Finally, those skilled in the art should 
appreciate that they can readily use the disclosed 
conception and specific embodiments as a basis for 
designing or modifying other structures for carrying out 
the same purposes of the present invention without 
departing from the spirit and scope of the invention as 
defined by the appended claims. 

[0046] What is claimed is: 